ProfileGrids solve the large alignment visualization problem: influenza hemagglutinin example

Authors

  • Alberto I Roca
  • Aaron C Abajian
  • David J Vigerust
Abstract

Introduction: The explosion in biological sequence information has led to the generation of large multiple sequence alignments (MSAs). For example, the biggest protein family alignment currently in the Pfam database (Wellcome Trust-Sanger Institute) has over 288,000 sequences [1]. A new generation of alignment programs, such as Clustal Omega [2], allows the routine calculation of such large alignments. However, a Nature Methods review [3] noted the lack of software tools for visualizing the results of large alignment calculations. Specifically, it called for overcoming the conceptual and technical limitations of large data sets so that one can visually navigate both an overview and the details of an alignment, while having mechanisms to query annotated data. We point out that the conceptual limitation was solved in late 2008 by the introduction of ProfileGrids as a new paradigm for visualizing large multiple sequence alignments [4]. Here, we report that the remaining technical limitations have been overcome with version 2.0 of the JProfileGrid software and that, therefore, the large alignment visualization problem has now been solved. We use the influenza hemagglutinin protein family as a case study to demonstrate the new features of the software.

Abstract: Large multiple sequence alignments are a challenge for current visualization programs. ProfileGrids are a solution that reduces alignments to a matrix, color-shaded according to the residue frequency at each column position. ProfileGrids are not limited by the number of sequences and so solve this visualization problem. We demonstrate the new metadata searching and grep filtering features of the JProfileGrid version 2.0 software on an alignment of 11,900 hemagglutinin protein sequences. JProfileGrid is free and available from http://www.ProfileGrid.org.
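The reduction of an alignment to a residue-frequency matrix, and the idea of filtering sequences by a pattern in their metadata, can be illustrated with a minimal Python sketch. This is not the JProfileGrid implementation: the function names `profile_grid` and `grep_filter`, the header format, and the four toy sequences are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def profile_grid(sequences):
    """Return {residue: [frequency per column]} for equal-length aligned sequences."""
    if not sequences:
        return {}
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("aligned sequences must all have the same length")
    n = len(sequences)
    # One Counter per alignment column, counting each residue (gaps included).
    columns = [Counter(s[i] for s in sequences) for i in range(length)]
    residues = sorted(set().union(*columns))
    # Rows are residues, columns are alignment positions, cells are frequencies.
    return {r: [col[r] / n for col in columns] for r in residues}


def grep_filter(records, pattern):
    """Keep (header, sequence) pairs whose header matches a regular expression."""
    regex = re.compile(pattern)
    return [(h, s) for h, s in records if regex.search(h)]


# Made-up toy records (hypothetical headers and residues, not real data).
records = [
    ("seq1|H1N1|2009", "MKA-"),
    ("seq2|H1N1|2009", "MKAI"),
    ("seq3|H3N2|2009", "MKTI"),
    ("seq4|H1N1|2010", "MRAI"),
]

grid = profile_grid([seq for _, seq in records])
h1n1 = grep_filter(records, r"H1N1")
```

The size of the resulting matrix depends on the alignment length and the residue alphabet, not on the number of sequences, which is why such a grid can summarize an arbitrarily large alignment. In a viewer, each cell's frequency would be mapped to a color shade.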

Similar resources

ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos

BACKGROUND The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment ...

Mutational dynamics of influenza A viruses: a principal component analysis of hemagglutinin sequences of subtype H1

A principal component analysis of a multiple sequence alignment of hemagglutinin sequences of subtype H1 has been performed, the sequences being encoded using the amino acid property that maximizes the weight of the major component. In the case of this alignment, it happens to be a well-known hydrophobicity scale. Interestingly, sequences coming from human have large positive amplitudes along th...

Sequence Analysis and Phylogenetic Study of Hemagglutinin Gene of H9N2 Subtype of Avian Influenza Virus Isolated during 1998-2002 in Iran

Sequence analysis and a phylogenetic study of the hemagglutinin (HA) gene of H9N2 subtype avian influenza virus isolates (outbreaks of 1998-2002) in Tehran province (Iran) were performed. Two sets of forward and reverse primers in highly conserved regions, based on sequences of the HA gene in GenBank, were designed. PCR products of a 430-bp fragment of 16 isolates were sequenced and then were aligned wi...

A new approach for data visualization problem

Data visualization is the process of transforming data, information, and knowledge into visual form, making use of humans’ natural visual capabilities. It reveals relationships in data sets that are not evident from the raw data by using mathematical techniques to reduce the number of dimensions in the data set while preserving the relevant inherent properties. In this paper, we formulated d...

A Reverse transcription-PCR assay for detection of type A influenza virus and differentiation of avian H7 subtype

Abstract: Avian influenza virus (AIV) infection is a major cause of influenza mortality in birds and can cause human mortality and morbidity. Although the risk of infection with AIV is generally low for most people, the pathogenic virus can cross the species barrier and acquire the ability to infect and be transmitted among the human population; therefore the ra...

Publication date: 2016